20 research outputs found

    MACOC: a medoid-based ACO clustering algorithm

    The application of ACO-based algorithms in data mining has grown over the last few years, and several supervised and unsupervised learning algorithms have been developed using this bio-inspired approach. Most recent work on unsupervised learning has focused on clustering, showing the great potential of ACO-based techniques. This work presents an ACO-based clustering algorithm inspired by the ACO Clustering (ACOC) algorithm. The proposed approach restructures ACOC from a centroid-based technique into a medoid-based technique, in which the properties of the search space need not be known; instead, it relies only on the distances among the data. The new algorithm, called MACOC, has been compared against well-known algorithms (K-means and Partition Around Medoids) and against ACOC. The experiments measure the accuracy of the algorithm on both synthetic datasets and real-world datasets extracted from the UCI Machine Learning Repository.
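As a minimal sketch of what "medoid-based" means in practice, the Python snippet below assigns each object to its nearest medoid and sums those distances, using nothing but a pairwise distance matrix. The function name `medoid_cost` and the toy matrix are illustrative assumptions and are not taken from the MACOC paper; the ACO search that would propose candidate medoids is not shown.

```python
from typing import List, Sequence, Tuple


def medoid_cost(
    dist: Sequence[Sequence[float]], medoids: Sequence[int]
) -> Tuple[List[int], float]:
    """Assign each object to its nearest medoid; return (labels, total distance).

    Only the pairwise distance matrix is used: no coordinates, no centroids.
    """
    labels: List[int] = []
    total = 0.0
    for i in range(len(dist)):
        # The nearest medoid is the candidate index with the smallest distance to i.
        nearest = min(medoids, key=lambda m: dist[i][m])
        labels.append(nearest)
        total += dist[i][nearest]
    return labels, total


# Toy symmetric distance matrix for four objects (hypothetical values).
D = [
    [0.0, 1.0, 4.0, 5.0],
    [1.0, 0.0, 3.0, 4.0],
    [4.0, 3.0, 0.0, 1.0],
    [5.0, 4.0, 1.0, 0.0],
]
labels, cost = medoid_cost(D, medoids=[0, 3])
```

Because a medoid is always one of the observed objects, a cost of this kind can be evaluated for any dissimilarity measure, which is the property the abstract points to when it says the search space need not be known.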

    Fire in ice: two millennia of boreal forest fire history from the Greenland NEEM ice core

    Biomass burning is a major source of greenhouse gases and influences regional to global climate. Pre-industrial fire-history records from black carbon, charcoal and other proxies provide baseline estimates of biomass burning at local to global scales spanning millennia, and are thus useful to examine the role of fire in the carbon cycle and climate system. Here we use the specific biomarker levoglucosan together with black carbon and ammonium concentrations from the North Greenland Eemian (NEEM) ice cores (77.49° N, 51.2° W; 2480 m a.s.l.) over the past 2000 years to infer changes in boreal fire activity. Increases in boreal fire activity over the period 1000–1300 CE and decreases during 700–900 CE coincide with high-latitude NH temperature changes. Levoglucosan concentrations in the NEEM ice cores peak between 1500 and 1700 CE, and most levoglucosan spikes coincide with the most extensive central and northern Asian droughts of the past millennium. Many of these multi-annual droughts are caused by Asian monsoon failures, thus suggesting a connection between low- and high-latitude climate processes. North America is a primary source of biomass burning aerosols due to its relative proximity to the Greenland Ice Cap. During major fire events, however, isotopic analyses of dust, back trajectories and links with levoglucosan peaks and regional drought reconstructions suggest that Siberia is also an important source of pyrogenic aerosols to Greenland.

    Naive Bayes ant colony optimization for designing high dimensional experiments

    In a large number of experimental problems, the high dimensionality of the search area and economic constraints can severely limit the number of experimental points that can be tested. Within these constraints, classical optimization techniques perform poorly, in particular when little a priori knowledge is available. In this work we investigate the possibility of combining approaches from statistical modeling and bio-inspired algorithms to effectively explore a huge search space while sampling only a limited number of experimental points. To this end, we introduce a novel approach combining ant colony optimization (ACO) and a naive Bayes classifier (NBC): the naive Bayes ant colony optimization (NACO) procedure. We compare NACO with other similar approaches in a simulation study. We then derive the NACO procedure with the goal of designing artificial enzymes with no sequence homology to the extant one. Our final aim is to mimic the natural fold of the 200-amino-acid 1AGY serine esterase from Fusarium solani.

    A Land-Use Perspective for Birdstrike Risk Assessment: The Attraction Risk Index

    Collisions between aircraft and birds, known as birdstrikes, pose a serious threat to aviation safety. The occurrence of these events is influenced by land uses in the surroundings of airports. Airports located in the same region might have different trends in birdstrike risk, due to differences in the surrounding habitats. Here we developed a quantitative tool that assesses the risk of birdstrike based on the habitats within a 13-km buffer from the airport. For this purpose, we developed Generalized Linear Models (GLMs) with binomial distribution to estimate the contribution of habitats to wildlife use of the study area, depending on season. The GLM predictions were combined with the flight altitude of birds within the 13-km buffer, the airport traffic pattern and the severity indices associated with impacts. Our approach was developed at Venice Marco Polo International airport (VCE), located in northeast Italy, and then tested at Treviso Antonio Canova International airport (TSF), which is 20 km inland. Results from the two airports revealed that both the surrounding habitats and the season had a significant influence on the pattern of risk. With regard to VCE, agricultural fields, wetlands and urban areas contributed most to the presence of birds in the study area. Furthermore, the key role of the distance of land uses from the airport in the probability of bird presence was highlighted. The reliability of the developed risk index was demonstrated, since at VCE it was significantly correlated with the birdstrike rate. This study emphasizes the importance of the territory near airports, and the wildlife use of its habitats, as factors in need of consideration in birdstrike risk assessment procedures.
Information on the contribution of habitats in attracting birds, depending on season, can be used by airport managers and local authorities to plan specific interventions in the study area in order to lower the risk.

    A coordinate-exchange two-phase local search algorithm for the D- and I-optimal design of split-plot experiments

    Many industrial experiments involve one or more restrictions on the randomization. In such cases, the split-plot design structure, in which the experimental runs are performed in groups, is a commonly used cost-efficient approach that reduces the number of independent settings of the hard-to-change factors. Several criteria can be adopted for optimizing split-plot experimental designs: the most frequently used are D-optimality and I-optimality. A multi-objective approach to the optimal design of split-plot experiments, the coordinate-exchange two-phase local search (CE-TPLS), is proposed. The CE-TPLS algorithm is able to approximate the set of experimental designs which concurrently minimize the D-criterion and the I-criterion. It allows for a flexible choice of the number of hard-to-change factors, the number of easy-to-change factors, the number of whole plots and the total sample size. When tested on four case studies from the literature, the proposed algorithm returns meaningful sets of experimental designs, covering the whole spectrum between the two objectives. In most of the analyzed cases, the CE-TPLS algorithm returns better results than those reported in the original papers and outperforms the state-of-the-art algorithm in terms of computational time, while retaining a comparable performance in terms of the quality of the optima for each single objective.
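To illustrate what "the set of experimental designs which concurrently minimize" two criteria looks like, the Python sketch below filters a list of candidate designs down to its non-dominated (Pareto) subset. It is not the CE-TPLS algorithm: the function name `pareto_front` and the criterion values are hypothetical, and the coordinate-exchange search that would generate the candidates is omitted.

```python
from typing import List, Tuple


def pareto_front(points: List[Tuple[float, float]]) -> List[Tuple[float, float]]:
    """Keep the points not dominated by any other point (both objectives minimized)."""
    front = []
    for p in points:
        # q dominates p if it is no worse on both objectives and is a different point.
        dominated = any(
            q[0] <= p[0] and q[1] <= p[1] and q != p for q in points
        )
        if not dominated:
            front.append(p)
    return front


# Hypothetical (D-criterion, I-criterion) values for five candidate designs.
candidates = [(1.0, 5.0), (2.0, 3.0), (3.0, 4.0), (4.0, 1.0), (2.5, 3.5)]
front = pareto_front(candidates)
```

The surviving points are the trade-offs between the two criteria: moving along the front improves one objective only at the expense of the other.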